Investigation of the Full Configuration Interaction Quantum Monte Carlo Method 

Using Homogeneous Electron Gas Models 



Or 



James J. ShepherdQ George H. Booth, and Ali Alav^ 

University of Cambridge, Chemistry Department, Lensfield Road, Cambridge CB2 1EW, U. 

(Dated: May 8, 2012) 



K, 



Using the homogeneous electron gas (HEG) as a model, we investigate the sources of error in 
the 'initiator' adaptation to Full Configuration Interaction Quantum Monte Carlo (i-FCIQMC), 
with a view to accelerating convergence. In particular we find that the fixed shift phase, where 
the walker number is allowed to grow slowly, can be used to effectively assess stochastic and ini- 
tiator error. Using this approach we provide simple explanations for the internal parameters of 
an i-FCIQMC simulation. We exploit the consistent basis sets and adjustable correlation strength 
of the HEG to analyze properties of the algorithm, and present finite basis benchmark energies 
for TV" = 14 over a range of densities 0.5 < r s < 5.0 a.u. A single-point extrapolation scheme is 
introduced to produce complete basis energies for 14, 38 and 54 electrons. It is empirically found 
that, in the weakly correlated regime, the computational cost scales linearly with the plane wave 
basis set size, which is justifiable on physical grounds. We expect the fixed shift strategy to reduce 
the computational cost of many i-FCIQMC calculations of weakly correlated systems. In addition, 
we provide benchmarks for the electron gas, to be used by other quantum chemical methods in 
exploring periodic solid state systems. 



PACS numbers: 



I. INTRODUCTION 



The simulation-cell homogeneous electron gas (HEG), 
consisting of N electrons in a finite and periodic box 
of length L, with a uniform neutralizing background, 
should be a compelling choice of model system for 
quantum chemical studies. Any determinant comprised 
of plane waves is an exact Hartrcc-Fock solution as 
well as the exact natural orbital representation for the 
gasi, and Hamiltonian matrix elements are analytically 
computable^. A single tunable density parameter (r s ) 
controls the strength of coupling, and the gas is then rep- 
resentative of wide range of weakly to strongly correlated 
electronic problems^. 

The complete one-particle space comprises of an infi- 
nite set of plane waves, and a common choice of wavevec- 
tors in a finite basis is a gridded sphere centred on the 
origin of reciprocal space. Since it provides a basis set 
tunable with only one parameter, a reciprocal space ra- 
dial cutoff k c , the limit k c — >■ oo, corresponding to the 
complete basis set limit, can be approached systemati- 
cally and straightforwardly, with no need to rc-optimise 
the orbitals between changes in basis set, since the Fock 
operator does not couple any spin orbitals. 

In spite of these apparent advantages, there has been 
little investigation of HEG using quantum chemical 
methods. Perhaps one drawback preventing such work 
is that expectation values of such a finite TV-electron gas 
differ from that of the infinite system and as such the 
thermodynamic limit, A" — > oo with the density held con- 
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stant, needs to be found to converge on physical charac- 
teristics accurately. Small- N simulation-cell gases suffer 
from so-called finite size effects, which can produce non- 
physical behavior in simulations^. Nevertheless, finite 
size corrections have been developed that allow physical 
electron-gas behavior to be observed for increasingly low 
electron numbers^r— . 

The 'exact' solution to the electronic Schrodinger equa- 
tion for a finite basis can be solved by expanding the 
wavefunction as an optimized linear combination of all 
Slater Determinants that can be formed from rearrang- 
ing N electrons in M basis functions. If these are found 
by an exact diagonalization, this method is referred to as 
full configuration interaction (FCI) and scales combinato- 
rially in A" and M (Ref. [^). Truncated CI techniques, re- 
stricting the calculation to a subset of the space, although 
potentially polynomially scaling, would yield zero corre- 
lation energy per electron in the thermodynamic limit 
due to lack of size extensivityi^. This makes treatment 
of the HEG of a modest size in even a tiny basis set 
intractable. 

FCI Quantum Monte Carlo and its 'initiator' adapta- 
tion (i-FCIQMC) are novel methods developed in a series 
of recent papers^ - — in which the FCI equations are sim- 
ulated by representing the determinant coefficients as a 
set of walkers evolving over discretized imaginary time. 
This allows much larger Hilbcrt spaces to be studied, 
with the largest space accurately sampled to date be- 
ing 10 108 , in a previous study of the 54-electron gasi£. 
In this study, the high-density gas was explored using 
i-FCIQMC to yield energies of, in principle, FCI accu- 
racy. The error incurred by using a finite basis was re- 
moved by an extrapolation scheme proposed by the au- 
thors and to be expanded on in a forthcoming paper. 
Comparison between our energies and those of recent 
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diffusion Monte Cario (DMC) calculations, based on a 
similar methodology as the famous study of Ceperley 
and Alder—, were consistent with the claim that mod- 
ern DMC energies for finite electron gases are thought to 
be accurate to within lmEh per electro n 18 ' 19 . 

It is our intention to continue to explore the use of 
i-FCIQMC to study the electron gas. In this paper we 
seek to use the advantages presented by the HEG to 
better-understand the method itself, and its potential 
application to periodic systems. In particular, we show 
that the only approximation made in i-FCIQMC , aris- 
ing from using a finite number of walkers, is rigorously 
controllable and can be removed in a systematic fashion 
with the use of the fixed shift strategy. This is a minor 
adaptation to the current algorithm to achieve FCI accu- 
racy reliably, and comparatively cheaply compared to a 
previous study. In doing so, we will also expose some of 
the benefits of using the HEG for studies using quantum 
chemical methods, and provide much-needed literature 
benchmarks. 
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and where g is the change in the one-particle momen- 
tum due to the excitation ij — > ab. The remaining 
term, «m, is the Madclung term, which represents con- 
tributions to the one-particle energy from interactions 
between a point charge and its own images and a neu- 
tralising backgrounds^. This is an artifact of performing 
a simulation-cell calculation and vanishes in the thermo- 
dynamic limit. The term in vm also cancels between the 
total FCI energy and the Hartree-Fock energy, making 
the (FCI) correlation energy independant of its value. 

The FCI solution to the Schrodinger equation ex- 
pressed in a basis of spin orbitals can be written as an 
optimized linear combination of Slater Determinants, 
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II. FCIQMC 

We seek to find the ground state wavefunction and 
energy of the iV-electron HEG in a simulation cell with 
periodic boundary conditions in the plane- wave represen- 
tation. The single particle states are given by, 



V>j(x) = i>j(T,a) 



(i) 



specified by a set of reciprocal lattice vectors {kj}, where 
f2 is the real-space unit cell volume of a cubic cell. Impos- 
ing a cubic symmetry to the simulation cell allows k to 
take values of =^ (n,m 7 l) where n,m and I arc integers. 
We then use a single cutoff parameter, k c , to confine our 
basis set to be the M spin orbitals resulting from those 
plane waves of a kinetic energy less than \k 2 c . 

The simulation-cell HEG Hamiltonian can be written 
using second quantization as: 



H 



ij 



2 ^ 

ijkl 



"MO>„ (2) 



where i, j, k and I refer to single-particle plane waves. 
The t\ matrix elements are due to the kinetic energy 
operator, 
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which is diagonal in the plane wave representation. 
The two-particle operator, containing electron-electron 
interactions, electron-background interactions and the 
background-background interaction, is represented by, 



,.ab 



-k^g.kj-kt, 
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which are antisymmetrized products of N normalized 
spin-orbitals, 



Di = A ['0i(xi)'0j(x2)...'0fc(xjv)] 



(7) 



All determinants formed from the rearrangement of the 
N electrons in the 2M spin orbitals are included in the 
sum over i, which uniquely labels each determinant^!. In 
the FCI approach to this problem, the coefficients are 
found by diagonalization of the Hamiltonian matrix. 

In a recently developed quantum Monte Carlo al- 
gorithm, termed Full Configuration Interaction QMC 
(FCIQMC)ii, the ground state wavefunction and energy 
are found by a long-time integration, 
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This can be re-cast in terms of the Hamiltonian matrix as 
a set of coupled equations for the determinant coefficients 
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where -Ehf is the Hartree-Fock energy and an arbitrary 
energy 'shift', S, has been introduced. These equations 
are then regarded as a set of master equations governing 
the dynamics of the evolution of the determinant coeffi- 
cients in imaginary time, with elements of H being non- 
unitary transition rates. The sign problem in this form of 
quantum Monte Carlo is generally ameliorated compared 
with that of diffusion Monte Carlo 2 ^. 

These dynamics are simulated by introducing a popu- 
lation of N w 'walkers', which, when distributed over the 
determinants, represent the sign of the coefficients in the 
FCI expansion for the purposes of the simulation, 
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where each walker can have a positive or negative 
sign. The walker population is then allowed to evolve 
through discrctized imaginary time-steps by spawning, 
death/cloning and annihilation events according to Eq.[S] 
until a steady-state is reached. 

The exact rules for this can be found in Ref. [Ill, but 
are described briefly here: 

1. In the spawning step, each walker is considered in 
turn. A connected determinant D } is chosen with 
a normalized probability p gon (j|i), and an attempt 
is made to spawn onto this determinant with prob- 
ability Ar ^. i i i .l . Attempts are generally restricted 
to coupled determinants, defined by i?y being non- 
zero, for efficiency. If this value exceeds 1, the num- 
ber of walkers spawned is related to the amount by 
which this value exceeds 1. The sign is determined 
as the same as the parent if i?y < and the oppo- 
site sign otherwise. 

2. In the death/cloning step each walker attempts to 
die or clone itself with probability St (Ha — £hf — S) 
where the walker dies if this is positive and is cloned 
if this is negative. 

3. Finally, in the annihilation step each walker is con- 
sidered and removed if there is an opposite-signed 
walker at the same determinant. 

The simulation has two phases: 

1. Fixed shift mode. In this period of the calculation, 
the shift (S) is fixed at a constant value. This 
should result in an exponential growth of walkers 
as long as S is greater than the correlation energy, 
whose rate depends on the value of this shift, the 
timcstcp and the correlation energy of the problem. 
Increasing the value that the shift is fixed at rel- 
ative to the correlation energy will result in faster 
growth, however it has been observed in some cases 
that this can result in longer equilibration times 
once the shift is allowed to vary. 

2. Variable shift mode. When a target walker number 
has been reached the simulation proceeds to vary 
S to keep the walker number N w constant. After 
an equilibration period, for high enough N w , the 
determinant populations equilibrate to a distribu- 
tion proportionate to the FCI wavefunction. The 
parameter S therefore is a population control pa- 
rameter. 

The energy can be found in two ways from the sim- 
ulation. In variable shift mode, S is updated self- 
consistently at equilibrium and oscillates around the cor- 
relation energy as expected from Eq. [9] 

However, throughout this work, the projected energy 
is used as an energy estimator for the dynamic, 

£fciqmc = lim £ ( Di \H\D )^f^-, (11) 



where Do is taken as the Hartrce-Fock determinant and 
j is taken as a sum over doubly-excited determinants. 

Typically the walker population is initially grown by 
setting S equal to zero, from one walker on the HF de- 
terminant. Only populations above a critical system- 
dependent size are able to converge to the FCI distribu- 
tion, and this size was found to scale linearly with the size 
of the Hilbert spaced. Nevertheless, small prcfactors to 
this scaling allowed the method to be used to achieve FCI 
accuracy on a range of systems which were previously out 
of reach of traditional diagonalization algorithms^. 

However, in order to alleviate this scaling problem, an 
adaptation of this method has been developed, called 
initiator-FCIQMC (i-FCIQMC)^^^. The determi- 
nant space is instantaneously divided into those deter- 
minants exceeding a population of n a( jd walkers, termed 
initiator determinants, and those that do not. When con- 
sidering a determinant whose current population is zero, 
the sum in the second term of Eq. [9j the term describing 
net flux of walkers onto that determinant, is taken to be 
only over initiator determinants. 

i-FCIQMC has been shown to dramatically accel- 
erate the convergence of FCIQMC with respect to 
walker number. In the large walker number limit, the 
i-FCIQMC tends to the FCIQMC algorithm, which it- 
self converges rigorously to the FCI energy. In previ- 
ous work, simulations with different walker numbers were 
performed to explicitly demonstrate convergence towards 
this limit by finding correlation energies over an increas- 
ing range of walker numbers^. In the present work we 
will show that this limit can be rigorously found from a 
single calculation. 



A. Previous Work on the HEG 

In a previous study-i&, i-FCIQMC was applied to the 
54-electron HEG at r s =0.5 and 1.0 a.u. to find energies 
for a range of N w and M. The N w — > oo limit was found 
by direct evaluation using separate converged runs at dif- 
ferent N w values, and the M — > oo limit was found by us- 
ing a 1/M extrapolation. Finally, the resultant complete 
basis set exact energy compared favorably with diffusion 
Monte Carlo results (DMC)^. 

In these simulations, the walker number was grown in 
fixed shift mode under a set of parameters S, St and n a dd 
before being released into variable shift mode at a certain 
N w . After being allowed to reach equilibrium, the finite 
walker i-FCIQMC energy (E (N w )) was found from an 
imaginary time average of the projected energy (Eq. Ill[) 
which does not depend on S, since it is collected after 
equilibration in variable shift mode. 

The form of the function E (N w ) is however dependent 
on the two parameters St and n a dd> which can be mod- 
ified and optimized for efficiency. E (N w ) is thought to 
vary the most with n a dd, which must be kept at the same 
value for a set of simulations. The i-FCIQMC scheme 
tends towards the original FCIQMC method in the limit 
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of "-add == or N w — >■ oo. A typical n ac jd chosen is three, 
and we will analyze this choice later. In contrast, E (N w ) 
is somewhat insensitive to the timestep St, which is set to 
avoid too many spawning events causing unoccupied de- 
terminants to immediately form initiator determinants, 
since this can cause slow convergence. 

In the HEG, the plurality of matrix elements with the 
same magnitude means that slow convergence is observed 
if any spawning events lead to immediate initiator for- 
mation, and as such, St is defined to be within the range, 



a single simulation can be written, 



St < 



Pgcn (j|i)n a dd 
\H»\ ' 



(12) 



This limit is analytically computable for a given N elec- 
trons and M spin orbitals since generation probabilities 
in this case are uniform: 



Pgcn 



N(N-1)M-N 



x ttL. 



(13) 



A St of approximately 90% this maximum allowed value 
is used to maintain high acceptance ratios. 

The finite walker i-FCIQMC energy, E (N w ), obtained 
from a simulation has associated with it a systematic er- 
ror due to the initiator approximation, which is rigor- 
ously removed in the limit of N w — » oo when E (N w ) —¥ 
-Efciqmc- The difference between E (N w ) and this limit 
is termed initiator error. 



III. ANALYSIS OF 7-FCIQMC AND A FIXED 
SHIFT STRATEGY FOR RAPID 
CONVERGENCE OF INITIATOR ERROR 

A. Division of initiator error and stochastic error 

We now describe a technique that allows us to better 
resolve the sources of error in the calculation and separate 
out stochastic error and initiator error. In so doing we 
will also place the technical observations of the previous 
study, described in Sec. Ill A[ on a more rigorous footing. 

In an i-FCIQMC run in fixed shift mode, as long as 
S is higher than the correlation energy, the walker num- 
ber grows exponentially, approximately as e T ^ s ~ Ecorr ^ . In 
this mode, the instantaneous projected energy (Eq. Ill[) 
will tend towards an increasingly stationary value in the 
large- N w limit, and settle onto the correlation energy for 
the problem. The ground state contribution to the wave- 
function should always grow faster than any other excited 
state, and so regardless of the specific value of the shift 
parameter (which can be considered an energy offset in 
the Hamiltonian matrix) the ground-state should be re- 
covered in the long-r limit. 

In a simulation where there is steady exponential 
growth, there is a one-to-one relationship between val- 
ues of t and N w . The instantaneous projected energy for 



E tA (t) = lim Y^{D- } \H\D ) 



N (t) ' 
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and is therefore a well-behaved function of N w . We 
assert that E T ^ (t) written in terms of N w is there- 
fore an approximation to the finite walker number 
i-FCIQMC energy E{N W ). This is only rigorously true 
in the absence of any need for equilibration, when the 
simulation could be released into variable shift mode 
without a change in the average projected energy. This 
will be true as long as the growth in walker number is 
quasi-adiabatic. This estimate can be a poor representa- 
tion of E (N w ), since there is a large amount of stochas- 
tic error in each point. As with a normal simulation in 
variable shift mode each point is now serially correlated 
with the one before it. However, unlike in variable shift 
mode, the correlation time cannot be assumed to be con- 
stant because walker number is increasing. Therefore the 
blocking analysis due to Flyvbjerg and Petersen^, which 
is used to extract the correlation time, is no longer ap- 
propriate and we must investigate another method for 
estimating and minimizing the stochastic error. 

Assuming points along E (N w ) are unaffected by the 
starting point of the simulation, a straightforward way 
of finding the stochastic error would be to use several 
independent calculations, with different random number 
seeds. For N r seeds we can compute the instantaneous 
average, 

j Nr 

E(N W )~E T (N W ) = -J2 E ^( T ( N ™V ( 15 ) 

i 

where we have now used r (N w ) to indicate that r is a 
function of N w . The stochastic error can be estimated 
from, 



1 



E 



[Ei(r(N w )f 

N r 



- (E T (N W )Y 



(16) 

The specifics of each walker growth profile will mean that 
due to statistical fluctuations, identical values of N w can 
not be assumed for each simulation. Therefore, averages 
are taken in intervals for the closest values to a chosen 
set of iVuj-values. 

Figure [TJ shows a simulation using this method, illus- 
trating a way to approximate E (N w ) without needing to 
obtain equilibrated energies from separate calculations 
at a set of N w values. To indicate that our estimate of 

the stochastic error is unbiased, it is also shown that the 

i 

stochastic standard deviation, e s N r 2 , is preserved when 
the number of seeds is changed for a wide range of N w 
(Fig. Ilb[) . This verifies that the stochastic error falls off 

as N r 2 which is consistent with a good, uncorrelatcd 
error estimator. Also shown in this plot is the relation- 
ship between N w and the stochastic standard deviation, 
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(a) Estimate of E (N w ) found by averaging E T i (r (N w )) over (b) Stochastic standard deviations for different numbers of seeds 
120 seeds 



FIG. 1: An example i-FCIQMC simulation performed on the N = 14, M — 186, r s = 1.0 a.u. electron gas with 
n ac jd = 3. In the modified approach proposed here, fixed shift mode is used throughout the simulation (in this case 

S = O.lEh). The stochastic error is found by averaging over different pseudorandom number sequences, started from 
N r different seeds. This is assumed to be free from serial correlation. In the high N w limit, the FCI energy for the 

problem is recovered in common with the same limit in variable shift mode. In plot (b), the approach of finding the 

error from different seeds is justified. The stochastic standard deviation for various N r agree, implying the standard 

_ i 

error decays as N r 2 . 




Simulation time / corehours 



FIG. 2: Simulation time for N = 14, M = 186, 
r s = 1.0 a.u. and n a( jd = 3 as a function of walker 
number. The formal scaling cost is O [N w Log(N w )], but 
the scaling in the high walker number limit tends 
towards approximately O [N w ] (shown by dashed, red 
line) . The logarithmic scale is used to show the range of 
the scaling relationship. 



and we consider the trend to be of the form N w a where a 
takes approximate values over the range shown in Fig. Ilbl 
from 0.35 < a < 0.47. 

The trend observed in stochastic error is important 
to understand when discussing computational costs and 
scalings of the method. There are two ways in which 



the simulation can be modified to reduce the stochas- 
tic error. Either the the number of parallel runs can 

be increased or the number of walkers can be increased, 

_ i 

resulting in polynomial N r 2 and N~ a decay of error re- 
spectively. For a fixed-shift calculation, the simulation 
cost increase of walker growth is ~ O [N w ] in the high 
N w limit (Fig. [5]), with the memory cost also being lin- 
ear in N w . The cost in terms of memory and runtime of 
more N r also scales linearly. As separate copies of the 
program are run simultaneously, the parallelization over 
N r is perfectly linear. It is therefore more cost-effective 
to reduce the stochastic error for systems for a < 0.5 
using more parallel runs. 

However, increasing the number of walkers also de- 
creases the initiator error. In practice, the rapidity with 
which initiator error decays still means that the N, w — > oo 
limit can be found by running the simulation at higher 
walker numbers until the energy does not change sig- 
nificantly with N w . Thereafter, provided a < 0.5, the 
stochastic error can be removed by using more random 
seeds. In the limit of converging onto the FCI wavefunc- 
tion, a = 0.5 due to increasingly fine discretization of 
5', and these two methods of decreasing stochastic error 
become equivalent. 

B. The role of the shift parameter, S 

We now wish to compare fixed shift and variable shift 
calculations for efficiency, and as such we now frame our 
discussion in terms of whether E (N w ) is best derived 
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(b) j-FCIQMC stochastic errors on E T ,s (N W ,S) for different 5 
values 

FIG. 3: i-FCIQMC runs at different values of the shift 
parameter. S, all performed using TV = 54, M = 186, 
r s = 1.0, n ac id = 3 and N r = 8. These estimates of 
E (N w ) tend towards the independent variable shift 
i-FCIQMC calculations as S tends towards E corr (panel 
a). The relaxation towards this limit is observed to be 
exponentially fast in A = S — E COXT . In the low A limit, 
the stochastic error does not change significantly 
between different values of S (panel b). 



FIG. 4: i-FCIQMC runs in fixed-shift mode at different 

values of the shift parameter, S, all performed using 
N = 54, M = 186, r s = 1.0, n add = 3 and N r = 1. The 
rate of growth of walkers per corehour was measured 
from the linear, high N w limit (Fig. Error bars were 
found on the fit on the order of 0.01%. The speed of 

growth in the low S limit grows linearly in 
A = S — -Ecorr from theoretical zero-rate growth at 
S = -Ecorr ( re d dashed line). The next leading order 
term appears to be exponential, but this might be due 
to lack of convergence for the high-shift values (Fig. [3J). 
Nonetheless, linear growth rates are demonstrated for 
high N w . 



from fixed-shift or variable-shift calculations. A crucial 
difference that this introduces is that E (N w ) estimated 
from a fixed-shift calculation is dependent on the S that 
the simulation is fixed at. As such we will denote this 
E Ty s (N W ,S), to make clear this dependence the energy 
now has on the choice of fixed shift. 

Comparison between E(N W ) and E Tt s (N w , S) for a 
variety of shift values shows empirically that as S is re- 
duced towards the correlation energy the estimate gets 
closer to the variable shift estimate for that given walker 
number. This is because, as S is reduced in fixed shift 
mode, growth of the population is slower and a greater 
amount of equilibration can occur between each increase 
in walker number. In the limit that S = E COTT , the popu- 
lation never grows and should be able to equilibrate per- 
fectly, and therefore become equivalent to variable shift 
mode in terms of the quality of the wavefunction gener- 
ated at a given walker number. We can therefore make 
the equivalence E (N w ) = E Tt s (N w , S = E COTI ). 

When choosing the shift for a simulation, there is a 
trade-off between lowering the shift, so that the conver- 
gence to the large N w energy is faster, and the runtime 
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0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 

A / E h 

FIG. 5: i-FCIQMC runs in fixed-shift mode, at different 

values of the shift parameter, S, all performed using 
N = 54, M = 186, r s = 1.0, n add = 3 and N r = 1. The 
relative cost of converging to the large walker limit for a 
given shift value is estimated from a combination of the 
cost of growth from Fig. [4] (dashed blue line), and the 
higher N w needed to converge a given initiator error 
seen in Fig. [3] (which is approximately exponential in A 

for small A). As such, there is a broad minimum in 
compound cost for a range of A, which we expect to be 
highly system-dependent. 



penalty this incurs from slow walker growth. The time 
taken to reach a certain walker number is found to scale 
linearly with 1/A, A = S - E COII , for low A (Fig. gj}. 
In contrast, the penalty for having too high S is expo- 
nential in A (Fig. [3} . There is a minimum in cost as S 
is increased at the cross-over between these two scaling 
relationships (Fig. [5]). 

The scaling of the growth rate for low A (Fig. g]), can 
be expressed as: 

lim — cx A (17) 

N w ^oo T 

where T is the total simulation time. This is directly 
related to the high-JV^, walker growth, which proceeds 
as, 

N w = e? A , (18) 

where /3 is a system-dependent constant, proportional to 
the total elapsed imaginary time. The simulation time is 
instantaneously proportional to N w , so total simulation 



time can be written, 

rP 

T = J N w {?) dp, 

= j\^dfi', (19) 

A ' 

Therefore, in the high N w limit Hff- cx A as required to 
yield a computational cost scaling as 1/A. 

C. The initiator threshold parameter, n a dd 

The initiator threshold parameter n a dd determines the 
number of walkers above which an occupied site is consid- 
ered an initiator and as such is an important parameter 
in i-FCIQMC. In the limit of N w — > oo all determinants 
become occupied and the algorithm returns to the origi- 
nal FCIQMC algorithm. The FCIQMC algorithm is also 
recovered in the limit of n a dd = but this negates the 
computational advantages of the initiator approximation. 

In principal, each value of n a dd should yield a different 
form of E (N w ) since n a dd alters the effects of the dy- 
namics in a non-trivial way. However, as Fig. [5] shows, a 
much simpler relationship is observed. Although the re- 
lationship between energy and walker number is different 
for each value of n a dd, this merely seems to rescale N w 
linearly. 

In this way n a dd can be seen to behave as a resolu- 
tion parameter. Imagine comparing two simulations with 
{N WtA , n a dd,A} and {N W>B , n a dd,B}- If N W>B = 2N WtA 
and n a dd,B = 2n a dd,A, and assuming that there was a 
one-to-one mapping between determinant populations, 
Nia 2JVj Bi the energy estimate at all paired val- 
ues of N* j = N w /n a( id would be the same. This is 
demonstrated schematically in Fig. [7] Although more 
walkers would normally lead to less stochastic error, the 
on-site and between-site flux would be rescaled with n a dd 
(Fig. I6b[) . This analysis and explanation should only hold 
in the high n a dd and N w limit, and it just so happens that 
?^add = 2 is high enough to display this behavior for the 
54-electron r s = 1.0 a.u. HEG. 

Finally, in these simulations, St was set according to 
Eq. [12J However, the form of E T (N w ) is independent 
of St for these systems provided the inequality Eq. [T^] is 
met. Since the run-time of a simulation is proportional 
to St, this is simply maximized. 

D. Summary 

To summarize, in this section we have looked 
at a number of the simulation parameters for an 
i-FCIQMC calculation within the context of the problem 
posed by the HEG, and generalized where possible. From 
consideration of isolated 'initiator' error arising from the 
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(b) Stochastic errors on E T s {N W ,S) for different n ac id values 

FIG. 6: z-FCIQMC runs at different values of the n ac jd 
parameter (N = 54, M = 186, r s = 1.0, N r = 8). To 

illustrate the apparent effect of changing n a dd , the N w 
axis has been rescaled by dividing through by n a( jd, 
causing the lines to be overlaid. 
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FIG. 7: Schematic diagram demonstrating how n a dd 
acts as a resolution parameter. In moving between the 
top and the bottom diagrams, ?i a dd has been doubled, 
but N w has also been doubled. This has no effect to the 
energy estimate, unless the ability to resolve low-weight 
determinants is important. In the high N w and n a dd 
limit this is unlikely to be the case. 



schemes where separate converged calculations were per- 
formed for a variety of walker numbers. The cost of 
this strategy was critically analyzed in terms of speed 
of walker growth and decay of random errors, providing 
optimal values for the fixed shift, and initiator thresh- 
old parameter, n a dd- In cases where this initiator error 
is challenging to converge, this strategy is presented as 
an appealing alternative. We mean this both in terms of 
effectively directing computational effort to ameliorate 
initiator and stochastic error, as well as provision of in- 
sight into the dynamics of the FCIQMC simulation in the 
space. 



finite number of walkers sampling the space, a new strat- 
egy was detailed whereby the calculation remains in fixed 
shift mode to reach high walker limits, while obviating 
the need for lengthy equilibration times in variable shift 
mode. These limits are required to demonstrate an ef- 
fective elimination of this initiator error. This new sim- 
ulation method was shown to be equivalent to previous 



IV. APPLICATION TO THE HEG 

We now present an application of FCIQMC to the elec- 
tron gas with the aim of producing results for the finite- 
basis 14-electron gas, and then also in the complete ba- 
sis set limit with N = 14, 38 and 54 electrons. Buoyed 
by various techniques to ameliorate the high scaling of 
quantum chemical methods, such as explicitly correlated 
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basis sets*£~—, local approximations^ 7 -"— and other s 30 ' 31 , 
quantum chemists are beginning to tackle the problems 
presented in the solid-state. However, these efforts have 
generally excluded an examination of the HEG. In pro- 
ducing high-accuracy literature benchmarks for the 14- 
electron gas, we hope that this will encourage the com- 
parison of other techniques intended for application to 
solid state systems in the growing community looking to 
use quantum chemical techniques 3 ^— . 

We now introduce an approximate extrapolation tech- 
nique to efficiently calculate complete basis set limit es- 
timates, with it in mind that these results can then be 
compared with DMC calculations. DMC calculations 
have been extremely successful in treating the HEG, 
with the most famous study being that of Ceperley and 
Alder—. Recent studies have tended to use the fixed- 
node approximation which intrinsically has an error as- 
sociated with it, but that error is both thought to be 
small and unquantifiable^. These results are widely re- 
garded as the best estimates of energies in the HEG over 
a range of densities. DMC has allowed for a large range 
of properties of the electron gas to be calculated, includ- 
ing phase diagram s 17 ' 39 , the effective mass 7 -^ -, the reno- 
malization factor—, spectral moments 4 ^ and the momen- 
tum distribution^ ' 43 ' 44 . Unfortunately, since to the best 
knowledge of the authors no low- TV simulations exist in 
the literature for DMC, comparison between FCIQMC 
energies and DMC are left largely as an open question. 
It is intended that these results ultimately be used to 
compare between the initiator error of FCIQMC and the 
fixed-node error remaining in modern backflow DMC re- 
sults. 



A. Basis set incompleteness error 

Having dealt with the internal parameters that are im- 
portant to the FCIQMC method, we now discuss the 
remaining parameter M, the size of our underlying one- 
electron basis, equal to twice the number of plane waves 
enclosed by a sphere centered at the origin of reciprocal 
space of radius k c . Using this single parameter, the com- 
plete basis set (CBS) limit can be found by taking the 
limit of the correlation energy as k c — > oo. 

The difference between the energy retrieved by a quan- 
tum chemical method in a finite basis, and the theoretical 
limit of this energy reached in a complete basis is termed 
basis set incompleteness error. Although a method may 
be a good approximation to exact results in principle, 
chemical accuracy is typically only achievable in the CBS 
limit. As such, there is much literature for how this limit 
is approached for molecular basis sets, and it has been 
shown both analytically and numerically that this limit 
is approached in X , the cardinal number of the basis set, 
as 1/A 3 (Ref. [H). 

In a separate paper yet to be published by the 
authors 4 ^, the CBS limit is shown to be approached 
as 1/M for plane- wave wavefunction methods. This is 
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FIG. 8: i-FCIQMC results for N = 14, r s = 1.0 showing 
a 1/M convergence to the CBS limit. These results 
were found to be converged with respect to initiator 
error at N. w ~ 10 7 — 10 8 . Stochastic error bars are 
plotted but generally too small to be seen. 
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FIG. 9: Graph comparing initiator and stochastic error 
for N = 54, M = 186, r s = 1.0 a.u., S = -0.1 E h and 
iV r = 8. The full curve of energies is shown in Fig. [3J 



in agreement with the corresponding trend when using 
Gaussian expansions since the number of spin orbitals 
used scales as X 3 . Figure [5] shows using this approach 
to obtain CBS limit energies for the r s = 1.0 a.u. 14- 
clcctron gas. 
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□ □ M = 186 ■ ■ M = 1850 



— U.JU -, , 4 c 

10 10 10 10 

A^HF 

(a) Initiator error does not change significantly with increase in 

basis set size in the high M limit for this system, since all 
curves are simply a shift in energy from each other. Error bars 
are only shown for M = 114 for clarity and dotted lines show 
asymptotic limits. 




(b) The stochastic error as a function of a given population on 
the HF determinant (TVhf) does not change significantly with 
increase in basis set size in the high M limit for this system. 

FIG. 10: Initiator and stochastic error for 
i-FCIQMC runs at different basis set sizes, N = 14, 
r s = 1.0, N r = 8 and S = 0.1. 



B. Basis set scaling 

An important aspect of any new method is the com- 
putational scaling of this method with parameters of the 
systems that are being studied. One such parameter of 
interest is basis set size, M. Since FCIQMC is a relatively 
new method, very little scaling work has been considered. 
Here, we present an initial analysis of basis set scaling as 
applied to the weakly correlated N — 14, r s = 1.0 a.u. 



electron gas. 

Since FCIQMC is a stochastic method, the factors 
affecting the time required for the simulation can be 
crudely broken down into three considerations: 

1. The number of walkers required to converge the 
calculation. This is primarily to eliminate initiator 
error, but number of walkers also aids in conver- 
gence of stochastic error. Initiator error estimates 
are very difficult to quantify due to, in principle, 
the N w — > oo limit needing to be reached for com- 
parison. Moreover, since the CBS estimate comes 
with a stochastic error, as do other points along the 
initiator error graph, the initiator error becomes 
rapidly lost in stochastic noise (Fig. [5]). However, 
initiator error seems to decay very rapidly for sys- 
tems studied and graphs of E (N w ) with character- 
istic decays can be compared. 

2. The simulation time taken to grow the number of 
walkers. Although this is 0[N w Log(N w )] in the 
current implementation, the difference between this 
and a linear scaling is very small in the high walker 
limit. Moreover, the algorithm could be theoreti- 
cally optimized for linear-scaling growth at all N w 
but this is thought to have a more costly prefac- 
tor. As such, this will be generally discussed as 
linear-scaling in N w . 

3. The simulation time taken to reduce the stochastic 
error. This is best achieved from the point of view 
of the present work by increasing the number of 
seeds for the simulation N r . 

The number of walkers on the Hartree-Fock determi- 
nant required to converge the calculation, from the point 
of view of initiator error, is startlingly invariant with re- 
spect to M for the system studied here (Fig. llOa]) . Initia- 
tor error plots appear essentially identical in the high M 
limit. In addition, the stochastic error decays at the same 
rate for each M with respect to the population on the 
Hartree-Fock determinant (Ahf), as shown in Fig. UObl 
However, the simulation time taken to reach the number 
of walkers required, however, increases as O [M] because 
the connectivity of the space grows as this factor, and 
therefore St is reduced to maintain the same quality of 
sampling. This factor seems to be the leading order scal- 
ing in M. 

Although ./Vhf would be expected to grow proportion- 
ally to N w at convergence, it only grows as (N w ) 7 for 
7 < 1 for typical HEG simulations. This can be seen 
from the ratio of the population on the Hartree-Fock de- 
terminant to the total walker population, which should 
be constant for 7 = 1 (Fig. Illa|) . This trend actually 
predicts 7 ~ 0.76 since a number of walkers at any time 
reside on low amplitude determinants, which are gener- 
ally very large in number, pushing N-hf/N w down. Al- 
though this would indicate a lack of convergence, the 
ratio of Ahf to the population at the double excited 
determinants, -/Vdoubs, is approximately constant within 
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(a) Change in fraction of the population at the Hartree-Fock 
determinant with walker population for a variety of basis set 
sizes 



(b) The ratio of the population at the Hartree-Fock 
determinant to the total population of doubly excited 
determinants changes with basis set size but not with N w at 



convergence. 



FIG. 11: Trends in the i-FCIQMC wavefunction for N = 14, r s = 1.0 a.u. 



at N,, 




162 



FIG. 12: When the initiator error is converged, for the 

N = 14, r s = 1.0 a.u., the fraction of the walker 
population that is at the Hartree-Fock determinant falls 
as 1/M for large basis set sizes 



stochastic fluctuations when a calculation has converged 
(Fig. lllb)) . These are the only contributing determinants 
to the projected energy, the energy estimate used in this 
study, and so it makes sense that when there is a station- 
ary distribution of walkers across these determinants, the 
simulation would be converged. 

Returning to the question of scaling with M, the frac- 
tion ^p 1 is shown in Fig. Q21 and generally behaves as 
1/M, for high walker numbers. The total number of 
walkers taken to roach a given 'target' population at the- 



HF determinant is therefore N„ 



M 



where A and 



B are constants. The leading order contribution to this 
in the high M limit is O [1]. This, then, should be mul- 



tiplied by the cost of walker growth, O [M] due to the 
required decrease in 5t, to give O [M], or linear scaling, 
overall. This linear scaling is only physically realistic if 
we consider that the basis functions in the high-energy 
parts of the space are completely decoupled from one an- 
other. This could well be very reasonable for the high M 
limit of a weakly correlated system. 

The final consideration to make is to comment that we 
have only observed this behavior for the relatively small 
system of N = 14, and the relatively weakly-correlated 
r s = 1.0 a.u. How transferrable are these findings? It 
is probable that the observation of a constant initiator 
error with M will not hold for larger, or more correlated, 
systems. Indeed it has already been shown that for the 
N = 54 electron gas that there is a significant change of 
initiator error with basis set sizei£, although it is likely 
that the high basis set limit is reached much more quickly 
for N = 14. Notwithstanding this, it is hard to think 
that computational effort would scale in any way other 
than exponentially in M, since the size of the space to 
be sampled is growing as O [Ml], but that the pref actor 
might be low enough that this is never observed within 
the desired random error. 

In spite of this somewhat surprising scaling relation, 
the 14 electron problem is not a trivial one for quantum 
chemical methods and upon entering the linear scaling 
regieme, the remaining basis set incompleteness error is 
still of the order ^0.01 Eh (Fig JlOa]) . As such the finding 
that the high M limit of this system can be captured as 
O [M] is important. The apparent lack of growth of ini- 
tiator error on increasing M shows that the sparsity with 
which the space to be sampled docs not grow significantly 
with M, which may well apply to other systems. 
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FIG. 13: Graphs of correlation energy retrieved with 
respect to walker number with different r s , M = 186. 
The N w —> oo limit corresponds to the correlation 
energy for each of the systems. Convergence to this 
limit is slower in N w for larger r s -values. 



C. Comparison of electron densities (r a ) 







r s ( 


a.u.) 




M 


0.5 


1.0 


2.0 


5.0 


114 


-0.5169(1) 


-0.46111(9) 


-0.3842(2) 


-0.2645(3) 


186 


-0.5589(1) 


-0.50093(9) 


-0.4207(3) 


-0.2928(4) 


358 


-0.5797(3) 


-0.5189(1) 


-0.4355(4) 


-0.3017(7) 


778 


-0.5893(3) 


-0.5265(2) 


-0.4410(5) 


-0.304(1) 


1850 


-0.5936(3) 


-0.5294(3) 


-0.4431(5) 




2368 


-0.5939(4) 


-0.5305(5) 


-0.4430(7) 




oo 


-0.5969(3) 


-0.5325(4) 


-0.4447(4) 


-0.306(1) 



TABLE I: i-FCIQMC correlation energies for TV = 14. 
The number in brackets corresponds to the stochastic- 
error in the preceeding digit. The M = oo result is 
based on extrapolations shown in Fig. [8] from which its 
error estimate derives. 
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FIG. 14: Basis set incompleteness error for 
0.5 < r s < 5.0 a.u. for a variety of basis set sizes. 
Convergence is of 1 /M form (dotted black lines) in the 

limit of 71/ — 5- oo, and the rate of convergence to this 
limit does not appear to change as r s is raised. This is 

partly because the magnitude of the basis set 
incompleteness error decreases with increasing r s , and 
therefore an estimate of the complete basis set result is 
less sensitive to the precise form and behavior of the 
extrapolation. 



ergy retrieved against walker number in Fig. [13] Whilst 
the r s = 0.5 calculation for the basis set shown is con- 
verged at 10 4 walkers, the r s = 5.0 calculation takes 10 9 
walkers to converge. We can attribute this to the lowering 
of the sparsity of this representation of the wavcfunction 
due to stronger correlation effects at larger r s . We an- 
ticipate that the r s parameter would behave similarly in 
a conventional FCIQMC calculation to the Hubbard U 
parameter, whose effect on the sign problem in FCIQMC 
has been analysed in detail^. The extrapolation to the 
complete basis set limit for these densities is shown in 
Fig. [T4J and indicate that the onset of the 1/M scaling 
regime is relatively insensitive to this density. This is 
also demonstrated later in Fig. [TTJ 



FCIQMC energies obtained for r s = 0.5 — 5.0 a.u. are 
given in Table U which we present as new small-system 
benchmarks. To our knowledge, systems of this small an 
electron number but with nonetheless vast Hilbert spaces, 
have not been studied to date and as such we have no 
values to compare to. In a previous study, however, we 
showed that FCIQMC results are highly competitive with 
DMC results in the complete basis limit 1 ^. 

As r s is raised, the difficulty of the problem for 
i-FCIQMC rises sharply, which we can see from the en- 



D. Using a single point extrapolation of the 
projected energy to achieve complete basis set 
estimates 

In a separate study of basis set convergence in plane 
wave wavefunction methods by the authors, yet to be 
published, it was shown that it is possible to use a sin- 
gle large basis set calculation to yield an estimate of the 
CBS correlation energy for the HEG. This is achieved 



13 



>. 

Sh 

a; 

3 
O 



O 

O 



-0.48 
-0.50 
-0.52 
-0.54 
-0.56 
-0.58 



• M = 1850 single point 'extrapolation 
■ £ j-FCIQMC X 




358" 1 114 -1 , 54" 
M or M 



-l 



FIG. 15: i-FCIQMC results for AT = 14, r s = 0.5 
showing a 1/M or 1/M' convergence to the CBS limit 
for two schemes. In the conventional scheme, 
calculations are run at different basis set sizes and 
extrapolated to the CBS limit (Sec. HVXf . However, it 
is also possible to find a CBS estimate from single point 
extrapolation (solid blue line) of the projected energy, 
where the points along this line derive from a single 
calculation at an overall basis set size M=1850 (see 
text). As such, they share a single stochastic error bar 
(dotted blue line). These results were found to be 
converged with respect to initiator error at 
N,„ ~ 10 6 - 10 7 . 



-0.55 
-0.56 
-0.57 
-0.58 



g -0.59 
-0.60 



o 

° -0.61 



-0.62 
-0.63 

CO 



^j-FCIQMC 

Single point extrapolation 
Direct extrapolation 
CBS estimate 



--fv. 



1 778" 1 358" 1 186" 1 
M _1 (direct and SPE) 



by dividing the contributions to the energy from a sin- 
gle large basis set calculation into regions of momentum 
space, producing smaller, effective basis sets from which 
the CBS limit can be estimated by extrapolation. 

Starting from the formulation of the FCIQMC corre- 
lation energy that we are using, the projected energy 
(defined in Eq. [IT]), 

£proj=ECDj|#|A)>^, (20) 
j 

where j refers to double excitations of the Hartree-Fock 
determinant. We can divide this into individual contri- 
butions from sets of four Appoints, which uniquely define 
the four one-electron states for each double excitation 
ij — > ab, where ij are occupied in the Hartree-Fock de- 
terminant and ab are unoccupied, 

occ virt 

Eco „ = j2J2^- (21) 

ij ab 

with, 

c k a k(, 

X^t = (ij\\ab}^, (22) 

Recalling that this set of /c-points is bounded by a max- 
imum kinetic energy, and thus has a basis set size (M) 



FIG. 16: Comparison between results obtained by 
conventional i-FCIQMC calculations with CBS 
estimates from either normal extrapolation, described in 
Sec. lIV Al or single point extrapolations (see text). At 
each value of M an i-FCIQMC simulation was run to 

yield an energy. This energy was then either 
extrapolated directly or a single-point extrapolation 
from this value of M was used to estimate the CBS 
limit using the masking function described in the text. 



associated with it as defined at the start of Sec. [TTJ The 
upper limit on the virtual space sum is therefore modi- 
fied, 

occ M 

iww = EE*k: k k ;- ( 23 ) 

ij ab 

We propose to regroup these energy contributions ac- 
cording to their arrangement in reciprocal space and use 
the behaviour of the coefficients to provide an estimate 
of the complete basis set limit. In doing so, we will con- 
struct energies of an effective basis sets of a smaller size, 
which can be viewed as groups of plane waves that lie on 
concentric spheres in reciprocal space, from a single large 
basis set calculation. Via extrapolation from all of these 
smaller effective basis set sizes, an estimate for the CBS 
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FIG. 17: Basis set incompleteness error for 
0.5 < r s < 5.0 for a variety of effective basis set sizes. In 
contrast to Fig. [TJ] the M' label here refers to an 
effective basis set size derived during single point 
extrapolation of the projected energy. Convergence is of 
1 /M' form (dotted black lines) in the limit of M' —> oo 
and this limit does not appear to be approached more 
slowly in M' as r s is raised, although the curvature is 
more pronounced at high r s . The basis set size M for 
these calculations is given in Table QTJ 



energy can then be obtained without the need for further 
calculations. 

In order to regroup these coefficients, we construct 
masking functions P, equal to either or 1, to effectively 
remove some of the terms in the sum in Eq. I23[ 



E r . 



occ M 

\ " \ ~" k a k 



J corr,efF 

(M) 

out of spherical step functions 



ij ab 



e (fc - k c ) 



i, |k| < k c 

0, otherwise 



(24) 



(25) 



such that these step functions, and hence the masking 
functions, have associated with them a kinetic energy 
cutoff analogous to the original energy cutoff for the cal- 
culation. These are then multiplied by the coefficients 
of the projected energy and re-summed at different new 
energy cutoffs that arc smaller than the original kinetic 
energy cutoff that the simulation was performed. 

In contrast to the original basis sets defined here, we 
construct these effective basis sets by using cutoffs based 
on the momentum transfer of each excitation. For ij — > 
ab, this is given by, 



or, equivalently, 

k a = kj - g' ; k b 



g , 



(27) 



due to conservation of momentum (kj + kj = k a + k;,). 
The masking function we have found to be most success- 
ful is, 



P g (g, g'; M g ) = &(g- g c ) + 6 (g' - g c ) 
-e(g-g c )Q (g' - g c ) , 



(28) 



g 



g- 



(26) 



which denotes the union of the set of k-points enclosed by 
two spheres of radius g c centred on the arguments of the 
function, g and g'. This leads to an expression for the 
energies of an effective basis set size M' , due to different 
effective cutoffs g c , 

occ M 

ij ab \ ) 

x P g (k t -k a ,kj : -k a ;A/'), 

This effective basis set size M' , denotes a truncated basis 
which encompasses twice the number of fc-points enclosed 
by a sphere of radius g c centred at the T-point. The 
behaviour of the energies due to these effective basis sets, 
in the limit that M is large enough to completely enclose 
all possible excitations of length g c , is also of form 1/M'. 
However, convergence on this linear behaviour is much 
faster and we are able to compute approximate CBS limit 
estimates from only one calculation. We therefore call 
this single point extrapolation. 

In Fig. 1161 we show an example of this technique being 
used to compute a CBS limit estimate from M = 1850. 
This agrees well with the CBS estimate from a normal 
extrapolation as in Sec. IIV A[ however this single point 
extrapolation scheme is found to converge at lower basis 
set sizes M (Fig. [To]) as well as providing a reliable ex- 
trapolation at each point. Errors in this technique arise 
from coefficient relaxation, due to the changing value of 
Xk^k 6 as M is varied. However, we find that there is 
cancellation of errors between the approximate effective 
basis energies and the resulting 1/M' gradient, and so 
convergence is rapid. 

It is worthwhile pointing out that this scheme is a 
marked contrast to the extrapolation scheme mentioned 
before in Sec. IIV Al where separate calculations, each vari- 
ationally the lowest energy achievable in a one-particle 
basis set, were used to extrapolate to the CBS limit. It 
is more common in the quantum chemical literature to 
take this previous approach and as such single point ex- 
trapolation goes against the prevailing literature. We ac- 
cept that these results will only be entirely accurate in 
the CBS limit of M — > oo, but note that this limit can 
be found systematically, and as such our results should 
be treated as CBS estimates. Furthermore, in the HEG, 
there is no orbital relaxation in the Hartrce-Fock orbitals 
as M increases, since they are exact. We nonetheless be- 
lieve this approach to be a reasonable one to take in plane 
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r s (a.u.) 


N 


-Ecorr (a.U.) 


N w 


M 


S (a.u.) 


»1add 


-Ecorr.DMC (a.U.) 


CPU time (corehours) 




14 


-0.5959(7) 


10 6 


1850 


0.0 


3 


_ 


200 


0.5 


38 


-1.849(1) 


10 8 


922 


0.1 


3 


_ 


4,000 




54 


-2.435(7) 


10 7 


922 


0.1 


3 


-2.387(2) 


4,000 




14 


-0.5316(4) 


10 7 


1850 


0.0 


3 


_ 


2,500 


1.0 


38 


-1.590(1) 


10 8 


922 


0.1 


3 


_ 


8,000 




54 


-2.124(3) 


10 s 


922 


0.1 


3 


-2.125(2) 


6,000 




14 


-0.444(1) 


10 7 


1850 


-0.2 


3 




2,500 


2.0 


38 


-1.225(2) 


10 9 


922 


0.1 


3 




16,000 


5.0 


14 


-0.307(1) 


10 10 


778 


-0.2 


3 




40,000 



TABLE II: z-FCIQMC complete basis set total correlation energies for a variety of N and r s , estimated using the 
projected energy single point extrapolation technique described in the text. The source of the error estimate is 
stochastic error. The results compare well with DMC results obtained by Rios et alJ£, which are comparable to 
those found by Kwon et alr^. For further discussion of this comparison, see Ref. [ill 



wave systems, where there is great flexibility in the ba- 
sis set size, slow basis set convergence, and in particular, 
is the most practical approach in terms of computational 
cost for the systems studied. Moreover, in real solid state 
systems, it is currently necessary to use an auxiliary basis 
set to find results at the complete basis set limits. 

In FCIQMC, the benefits of using a single point extrap- 
olation are substantial, due to the effects of the initiator 
approximation. The approximate form of the curve in 
Fig. [12] appears to converge very rapidly with respect to 
walker number, in particular allowing for an estimation of 
where the linear regime begins at very low computational 
cost. Furthermore, once a basis size has been chosen to 
perform a single point extrapolation from, the initiator 
error in the coefficients should be consistent for all ef- 
fective basis set sizes, somewhat mitigating errors due to 
change in initiator error across different basis sizes in the 
more traditional extrapolation scheme. 

It is also possible to calculate an estimate for the CBS 
limit on the fly during a simulation, when the region of 
linear 1 jM' behaviour is known, allowing for the produc- 
tion of an easily computable, rapidly convergent, stochas- 
tic correction. Furthermore, it is possible to probe the 
convergence in a variety of orbital subspaces using differ- 
ent masking functions, which may be useful in the future 
to help understand the nature of the initiator approxima- 
tion from the point of view of the one- particle basis set. 
Taken together, these extend the practical use of such a 
technique, however extensive study of this is beyond the 
scope of this paper. 

We therefore conclude by presenting a set of correlation 
energies in Table [TT1 We are confident that all 14 electron 
results are free from both initiator and basis set incom- 
pleteness error, although we cannot rule out this possibil- 
ity for the higher electron numbers. Convergence of the 
-E ff -cutoff extrapolation at a range of r s -values is shown 
in Fig. [17] and Table [TT1 M is assumed to be large enough 
for the single-point extrapolation to remove the basis set 
incompleteness error. This is supported by agreement 
(within stochastic error bars) with energies presented in 



Sec. lIV Al in which the CBS limit is achieved from a con- 
ventional l/M extrapolation, and with values published 
in a previous paper on the 54-electron problem. They 
also compare well with the most accurate DMC calcula- 
tions to dat o 18 ' 19 , in particular yielding a lower energy at 
N = 54 at r s = 0.5, again consistent with our previous 
paper (Ref. G3). We also note that computational cost 
of these new CBS z-FCIQMC energies is approximately 
100 times smaller than previously^. 

V. CONCLUDING REMARKS 

In this paper we have applied z-FCIQMC , to the sim- 
ulation cell HEG at a variety of system sizes, N = 14, 38 
and 54 electrons, over a range of correlation strengths 
0.5 < r s < 5.0 a.u. 

We develop the use of a fixed shift strategy to exam- 
ine convergence of the calculations to the large walker 
limit. The i-FCIQMC method has associated with it 
two sources of error when trying to calculate FCI ac- 
curacy energies. These are stochastic error, arising from 
the evolution of the discretized wavefunction coefficients 
through imaginary time, and initiator error, arising from 
using a finite number of walkers for the simulation. We 
investigated these two sources of error and showed that, 
with a small modification to the current algorithm, they 
can be independently reduced, removed or quantified sys- 
tematically. In so doing we also gave an explanation of 
the internal parameters within an z-FCIQMC simulation 
and how these can be optimized for computational effi- 
ciency. 

Making use of the easily-tunable basis set of the HEG, 
we demonstrated that the basis set scaling for the very 
large basis sets of a weakly correlated system (JV = 14, 
r s = 1.0 a.u.) was approximately O [M]. We could find 
no evidence in the very high basis set limit for an ex- 
ponential scaling, as previous studies of molecular sys- 
tems have identifie d 13 ' 15 , although note that our conclu- 
sion would almost certainly change with system size and 
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strength of correlation. 

Finally, we applied the newly-developed single point 
extrapolation for the projected energy, which uses infor- 
mation from a single large-basis-set calculation to extrap- 
olate to the complete basis set limit, and successfully 
yield complete basis set energies for a range of N and r s . 
We note that in combination with the fixed shift strategy, 
this leads to a 100 fold saving in computational cost in 
producing complete basis set energies for the 54 electron 
problem^. 

In so doing we hope that we have demonstrated both 
that the HEG is a versatile and useful model system, pro- 
viding benchmarks for the future application of quantum 
chemical techniques, and also more rigorously analyzed 



some of the open methodological questions surrounding 
i-FCIQMC. 
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